MetaQuerier over the Deep Web: Shallow Integration across Holistic Sources

Authors

  • Kevin Chen-Chuan Chang
  • Bin He
  • Zhen Zhang
Abstract

The Web has been rapidly “deepened” by myriad searchable databases online. To enable effective access to the “deep Web,” we are building the MetaQuerier for exploring and integrating databases on the Web. Such metaquerying must tackle integration at a large scale (as sources are proliferating online) and of a dynamic nature (as each query will access different sources). Toward such integration, our approach hinges on the insight that the challenge of large scale is itself an opportunity: we observe that the desired “semantics” often connects to surface presentation characteristics, through some hidden regularities over many sources. Generalizing our recent work, this paper thus proposes our approach of shallow integration across holistic sources: to discover desired semantics by exploiting the hidden regularities of shallow clues across many sources holistically. As evidence, we have studied two concrete problems: 1) query-interface understanding based on hidden syntax, and 2) query-interface matching based on hidden statistical models. Our experience indicates high promise for employing shallow techniques across holistic sources.

Similar resources

Toward Large Scale Integration: Building a MetaQuerier over Databases on the Web

The Web has been rapidly “deepened” by myriad searchable databases online, where data are hidden behind query interfaces. Toward large scale integration over this “deep Web,” we have been building the MetaQuerier system for both exploring (to find) and integrating (to query) databases on the Web. As an interim report, first, this paper proposes our goal of the MetaQuerier for Web-scale integra...

Discovering Attribute Locality across the Deep Web: an Ordering-Based Approach

The large number of structured database sources on the Web presents a pressing need for information integration at a large scale. How can we enable systematic access to this “deep Web”? We observe that, while autonomous sources are seemingly independent, their query schemas often reveal certain correlations, such that sources in the same structured domain (e.g., books, cars) tend to share a “loca...

On-the-Fly Constraint Mapping across Web Query Interfaces

Recently, the Web has been rapidly “deepened” by the prevalence of databases online and has become an important frontier for data integration. On this deep Web, a significant amount of information can only be accessed in response to dynamically issued queries to the query interface of a back-end database, instead of by traversing static URL links. Such a query interface expresses a set of constr...

Adaptive Information Analysis in Higher Education Institutes

Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables managers to analyze and evaluate the effectiveness of education processes. However, the problem with traditional information integration is the lack of personalization due to weak information resources or the unavailability of analysis functionality. In this ...

Publication date: 2004